46 research outputs found

    Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization

    We present a graph-based variational algorithm for classification of high-dimensional data, generalizing the binary diffuse interface model to the case of multiple classes. Motivated by total variation techniques, the method involves minimizing an energy functional made up of three terms. The first two terms promote a stepwise continuous classification function with sharp transitions between classes, while preserving symmetry among the class labels. The third term is a data fidelity term, allowing us to incorporate prior information into the model in a semi-supervised framework. The performance of the algorithm on synthetic data, as well as on the COIL and MNIST benchmark datasets, is competitive with state-of-the-art graph-based multiclass segmentation methods. Comment: 16 pages, to appear in Springer's Lecture Notes in Computer Science volume "Pattern Recognition Applications and Methods 2013", part of series on Advances in Intelligent and Soft Computing.
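
The abstract above names three energy terms without giving formulas. The following is a minimal sketch of one way such an energy could be evaluated, not the authors' exact functional. Everything specific here is an assumption made for illustration: the function name `gl_energy`, the parameters `eps` and `mu`, the unnormalized graph smoothness term, the product-of-wells potential over one-hot class vertices, and the quadratic fidelity term on labeled nodes.

```python
from math import prod


def gl_energy(u, weights, labels, eps=1.0, mu=1.0):
    """Evaluate a three-term Ginzburg-Landau-style energy (illustrative only).

    u       : list of per-node class vectors, e.g. [[1, 0, 0], [0, 1, 0]]
    weights : dict {(i, j): w_ij}, each undirected edge listed once
    labels  : dict {node index: known class index} for the labeled nodes
    """
    n_classes = len(u[0])

    def one_hot(k):
        # Vertex of the simplex that represents a pure assignment to class k.
        return [1.0 if c == k else 0.0 for c in range(n_classes)]

    # Term 1 (assumed form): graph smoothness, penalizing differences
    # between the class vectors of strongly connected nodes.
    smooth = 0.5 * eps * sum(
        w * sum((a - b) ** 2 for a, b in zip(u[i], u[j]))
        for (i, j), w in weights.items()
    )

    # Term 2 (assumed form): multi-well potential that vanishes only when
    # a node sits exactly on one class vertex; it treats all classes alike.
    def well(ui):
        return prod(
            0.25 * sum(abs(a - e) for a, e in zip(ui, one_hot(k))) ** 2
            for k in range(n_classes)
        )

    potential = sum(well(ui) for ui in u) / (2.0 * eps)

    # Term 3 (assumed form): data fidelity on the nodes with known labels.
    fidelity = 0.5 * mu * sum(
        sum((a - e) ** 2 for a, e in zip(u[i], one_hot(k)))
        for i, k in labels.items()
    )

    return smooth + potential + fidelity


# A sharp labeling that agrees with the known labels has zero energy.
sharp = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# A node that hesitates between two classes is penalized.
blurred = [[0.5, 0.5, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
edges = {(0, 1): 1.0}
known = {0: 0, 2: 1}
print(gl_energy(sharp, edges, known), gl_energy(blurred, edges, known))
```

In this sketch the first two terms together favor labelings that are constant within well-connected groups and sharp between them, while the third ties the labeled nodes to their known classes.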

    Building multiclass classifiers for remote homology detection and fold recognition

    BACKGROUND: Protein remote homology detection and fold recognition are central problems in computational biology. Supervised learning algorithms based on support vector machines (SVMs) are currently among the most effective methods for solving these problems. These methods are primarily used to solve binary classification problems and have not been used extensively for the more general multiclass remote homology prediction and fold recognition problems. RESULTS: We present a comprehensive evaluation of a number of methods for building SVM-based multiclass classification schemes in the context of the SCOP protein classification. These methods include schemes that directly build an SVM-based multiclass model, schemes that employ a second-level learning approach to combine the predictions generated by a set of binary SVM-based classifiers, and schemes that build and combine binary classifiers for various levels of the SCOP hierarchy beyond those defining the target classes. CONCLUSION: Analyzing the performance achieved by the different approaches on four different datasets, we show that most of the proposed multiclass SVM-based classification approaches are quite effective in solving the remote homology prediction and fold recognition problems. Schemes that use predictions from binary models constructed for ancestral categories within the SCOP hierarchy tend not only to lead to lower error rates but also to reduce the number of errors in which a superfamily is assigned to an entirely different fold or a fold is predicted as being from a different SCOP class. Our results also show that the limited size of the training data makes it hard to learn complex second-level models, and that models of moderate complexity lead to consistently better results.

    A preliminary laboratory study on the salinity and temperature tolerances of some medusae from the São Paulo coast, Brazil

    The salinity and temperature tolerances of some species of medusae were studied in the laboratory. The results showed the following order of tolerance to diluted seawater: Cirrholovenia tetranema, Clytia cylindrica and Eucheilota duodecimalis > Proboscidactyla ornata and Obelia spp. > Euphysora gracilis, Ectopleura dumortieri, Liriope tetraphylla and Cunina octonaria. For decreasing temperature, the following order was obtained: Ectopleura dumortieri, Euphysora gracilis, Obelia spp. and Proboscidactyla ornata > Liriope tetraphylla > Cunina octonaria > Clytia cylindrica and Eucheilota duodecimalis. The laboratory results are discussed in relation to the distribution of the species in nature.

    Multiclass classification of microarray data samples with a reduced number of genes

    BACKGROUND: Multiclass classification of microarray data samples with a reduced number of genes is a rich and challenging problem in Bioinformatics research. The problem gets harder as the number of classes is increased. In addition, the performance of most classifiers is tightly linked to the effectiveness of mandatory gene selection methods. Critical to gene selection is the availability of estimates about the maximum number of genes that can be handled by any classification algorithm. Lack of such estimates may lead to either computationally demanding explorations of a search space with thousands of dimensions or classification models based on gene sets of unrestricted size. In the former case, unbiased but possibly overfitted classification models may arise. In the latter case, biased classification models unable to support statistically significant findings may be obtained. RESULTS: A novel bound on the maximum number of genes that can be handled by binary classifiers in binary mediated multiclass classification algorithms of microarray data samples is presented. The bound suggests that high-dimensional binary output domains might favor the existence of accurate and sparse binary mediated multiclass classifiers for microarray data samples. CONCLUSIONS: A comprehensive experimental work shows that the bound is indeed useful to induce accurate and sparse multiclass classifiers for microarray data samples.

    Maximizing upgrading and downgrading margins for ordinal regression

    In ordinal regression, a score function and threshold values are sought to classify a set of objects into a set of ranked classes. Classifying an individual in a class with higher (respectively lower) rank than its actual rank is called an upgrading (respectively downgrading) error. Since upgrading and downgrading errors may not have the same importance, they should be considered as two different criteria when measuring the quality of a classifier. In Support Vector Machines, margin maximization is used as an effective and computationally tractable surrogate for the minimization of misclassification errors. As an extension, we consider in this paper the maximization of upgrading and downgrading margins as a surrogate for the minimization of upgrading and downgrading errors, and we address the biobjective problem of finding a classifier that simultaneously maximizes the two margins. The whole set of Pareto-optimal solutions of this biobjective problem is described as translations of the optimal solutions of a scalar optimization problem. For the most popular case, in which the Euclidean norm is considered, the scalar problem has a unique solution, which implies that all the Pareto-optimal solutions of the biobjective problem are translations of each other. Hence, the Pareto-optimal solutions can easily be provided to the analyst, who, after inspecting the misclassification errors caused, can choose the most convenient classifier at a later stage. This analysis provides a theoretical foundation for a strategy popular among practitioners, based on the so-called ROC curve, which is shown here to equal the set of Pareto-optimal solutions of simultaneously maximizing the downgrading and upgrading margins.
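
The definitions of upgrading and downgrading errors in the abstract above can be illustrated with a minimal sketch. It assumes that ranks are integers starting at 0 and that an object's predicted rank is the number of thresholds at or below its score. The helper names `classify` and `count_errors` and the toy data are hypothetical and do not come from the paper.

```python
from bisect import bisect_right


def classify(score, thresholds):
    """Return the predicted rank: how many sorted thresholds the score reaches."""
    return bisect_right(thresholds, score)


def count_errors(scores, true_ranks, thresholds):
    """Count (upgrading, downgrading) errors for a score/threshold classifier."""
    upgrading = downgrading = 0
    for score, rank in zip(scores, true_ranks):
        predicted = classify(score, thresholds)
        if predicted > rank:
            upgrading += 1      # placed in a higher-ranked class than the true one
        elif predicted < rank:
            downgrading += 1    # placed in a lower-ranked class than the true one
    return upgrading, downgrading


scores = [0.5, 1.5, 2.5, 1.2]
true_ranks = [0, 2, 1, 1]

# Translating the thresholds trades one kind of error for the other.
print(count_errors(scores, true_ranks, [1.0, 2.0]))  # (1, 1)
print(count_errors(scores, true_ranks, [1.6, 2.6]))  # (0, 2)
```

Shifting both thresholds by the same amount is the kind of translation the abstract refers to: in this toy data it removes the upgrading error at the cost of an extra downgrading error, so the choice between the two classifiers depends on which error matters more to the analyst.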

    Calibrating Margin-Based Classifier Scores into Polychotomous Probabilities
